The Prime Number Theorem

Introduction

Prime numbers feel mysterious at first glance. They appear irregular and unpredictable, yet mathematicians have discovered deep patterns in how often they occur.
This article explores those patterns, culminating in one of the great achievements of mathematics: the Prime Number Theorem (PNT).

Key goals of this article:

Understand how mathematicians “count” primes
See what the Prime Number Theorem actually says
Explore why it matters

Why the Distribution of Primes Matters

Prime numbers matter because:

They are the foundation of number theory
They appear in cryptography (e.g., RSA encryption)
They reveal surprising patterns in the integers
They connect to deep ideas in analysis and geometry

Understanding how often primes appear is a central question in mathematics.

Counting Primes: The Prime‑Counting Function $\pi(x)$

To study prime distribution, mathematicians use a function:

$\pi(x)$ = the number of primes less than or equal to $x$

Examples:

$\pi(10) = 4$ (primes: 2, 3, 5, 7)
$\pi(100) = 25$
$\pi(1000) = 168$

Why this function matters:

It gives a precise way to measure prime density
It allows us to compare actual counts with predictions

Historical Attempts to Understand Prime Patterns

Early mathematicians noticed:

Primes become less frequent as numbers grow
But they never stop appearing
Their spacing seems irregular

Key historical figures:

Euclid: proved infinitely many primes
Euler: connected primes to infinite series
Gauss & Legendre: made early guesses about prime density

These early insights paved the way for the Prime Number Theorem.

Gauss and Legendre: Early Conjectures About Prime Density

Both Gauss and Legendre independently studied tables of primes and proposed formulas for $\pi(x)$.

Gauss’s idea

He suggested that primes around a large number $x$ occur with density roughly: $$\frac{1}{\ln x}$$ This leads to the approximation: $$\pi(x) \approx \frac{x}{\ln x}$$

Legendre’s idea

He proposed a slightly different formula: $$\pi(x) \approx \frac{x}{\ln x - 1.08366}$$ Gauss’s version turned out to be the correct leading behavior.

The Prime Number Theorem: Statement and Meaning

The Prime Number Theorem (PNT) states: $$\pi(x) \sim \frac{x}{\ln x}$$

The symbol $\sim$ means that the ratio of the two expressions approaches 1 as $x$ becomes large.

In plain language:

The number of primes up to $x$ is approximately $x / \ln x$
The approximation gets better for larger $x$

This is a profound insight into the structure of the integers.

Understanding $\pi(x) \sim x / \ln x$

Why does this approximation make sense?

Key ideas:

Larger numbers have more potential divisors
So primes become rarer
The function $\ln x$ grows slowly
Dividing by $\ln x$ captures the decreasing density

Interpretation:

Around $x = 1{,}000{,}000$, the density of primes is about $$\frac{1}{\ln(1{,}000{,}000)} \approx \frac{1}{13.8}$$ meaning roughly 1 prime out of every 14 numbers.

How Good Is the Approximation?

Some sample comparisons:

$x$	Actual $\pi(x)$	Approx. $x/\ln x$
100	25	21.7
1,000	168	144.8
1,000,000	78,498	72,382

Observations:

The approximation is always a bit low
But the relative error shrinks as $x$ grows
The trend is remarkably accurate

What the Theorem Doesn’t Tell Us

The PNT does not answer:

Where the next prime is
How large the gaps between primes can be
Whether there are infinitely many twin primes
Whether primes follow any simple pattern

It describes average behavior, not exact positions.

A Glimpse at the Ideas Behind the Proof

The proof of the PNT (completed in 1896) uses advanced tools, but the broad strokes involve:

Studying the Riemann zeta function
Connecting primes to zeros of this function
Using complex analysis to estimate prime density

You don’t need the details to appreciate the result—just the idea that primes are linked to deep analytic structures.

Connections to the Riemann Hypothesis

The Riemann Hypothesis (RH) is one of the most famous unsolved problems in mathematics.

Connections to primes:

RH predicts an even more accurate formula for $\pi(x)$
It would give extremely precise bounds on prime gaps
It would refine our understanding of prime randomness

The PNT is like the “first layer” of this deeper mystery.

Calculator

Defining $\pi(x)$

We can define a custom function for calculating $\pi(x)$

pnt(x) = x / log(x) pnt(100) pnt(1000) pnt(1000000)

Exercises

Try these to reinforce your understanding.

Compute $\pi(20)$ by listing all primes up to 20.
Solution
Compute $\pi(20)$ by listing all primes up to 20.
- Primes $\le 20$:
  $2, 3, 5, 7, 11, 13, 17, 19$
- Count: There are 8 primes.
- Answer: $\pi(20) = 8$.
Compare $\pi(100)$ with the approximation $100 / \ln 100$.
Solution
Compare $\pi(100)$ with the approximation $100 / \ln 100$.
- Actual value: $\pi(100) = 25$.
- Approximation: $$\frac{100}{\ln 100} \approx \frac{100}{4.6052} \approx 21.7$$
- The approximation is a bit low but in the right ballpark.
Estimate how many primes are below $x = 10{,}000$ using $x / \ln x$.
Solution
Estimate how many primes are below $x = 10{,}000$ using $x / \ln x$.
- Use: $$\frac{10{,}000}{\ln 10{,}000} \approx \frac{10{,}000}{9.2103} \approx 1085.7$$
- Estimated number of primes: about $1086$.
- (For reference, the true value is $\pi(10{,}000) = 1229$, so the estimate is reasonable but low.)
Explain in your own words why primes become less frequent as numbers grow.
Solution
Explain in your own words why primes become less frequent as numbers grow.
One good explanation:
- Larger numbers have more possible divisors.
- With more chances to be divisible by something, it becomes less likely a number has no divisors except 1 and itself.
- So as numbers grow, it’s “harder” to be prime, and primes become rarer.
Use the density $1/\ln x$ to estimate how many primes lie between 1,000,000 and 1,010,000.
Solution
Use the density $1/\ln x$ to estimate how many primes lie between $1{,}000{,}000$ and $1{,}010{,}000$.
We approximate the number of primes in an interval $[a,b]$ by: $$\text{(length of interval)} \times \frac{1}{\ln(\text{typical } x)}$$
- Interval length: $1{,}010{,}000 - 1{,}000{,}000 = 10{,}000$.
- Take a typical $x$ near the middle, say $x \approx 1{,}005{,}000$.
- Compute: $$\ln(1{,}005{,}000) \approx 13.82$$ $$\text{Estimated primes} \approx \frac{10{,}000}{13.82} \approx 723$$
- Answer: roughly $700$–$750$ primes in that interval.
True or false: The PNT tells us exactly where the next prime is.
Solution
True or false: The PNT tells us exactly where the next prime is.
- Answer: False.
- The PNT describes the average distribution of primes and gives an approximation for how many primes are up to $x$, but it does not tell us the exact location of any particular prime.
Explain what the symbol $\sim$ means in the statement $\pi(x) \sim x/\ln x$.
Solution
Explain what the symbol $\sim$ means in the statement $\pi(x) \sim x/\ln x$.
- The statement $\pi(x) \sim \dfrac{x}{\ln x}$ means: $$\lim_{x \to \infty} \frac{\pi(x)}{x/\ln x} = 1$$
- In words: as $x$ becomes very large, the ratio of $\pi(x)$ to $x/\ln x$ gets closer and closer to 1.
- So $x/\ln x$ is an asymptotic approximation to $\pi(x)$.
Bonus: Find an $x$ where the approximation $x/\ln x$ is within 5% of the true value of $\pi(x)$.
Solution
Bonus: Find an $x$ where the approximation $x/\ln x$ is within 5% of the true value of $\pi(x)$.
One example (you might have found others):
- Take $x = 1{,}000{,}000$.
  - Actual: $\pi(1{,}000{,}000) = 78{,}498$.
  - Approximation: $$\frac{1{,}000{,}000}{\ln 1{,}000{,}000} \approx \frac{1{,}000{,}000}{13.8155} \approx 72{,}382$$
  - Relative error: $$\frac{78{,}498 - 72{,}382}{78{,}498} \approx 0.078 \approx 7.8\%$$
  - This is not within 5%, but it shows the approximation improving.
A better example is around $x = 10{,}000{,}000$:
- Actual: $\pi(10{,}000{,}000) = 664{,}579$.
- Approximation: $$\frac{10{,}000{,}000}{\ln 10{,}000{,}000} \approx \frac{10{,}000{,}000}{16.1181} \approx 620{,}420$$
- Relative error: $$\frac{664{,}579 - 620{,}420}{664{,}579} \approx 0.066 \approx 6.6\%$$
Still slightly above 5%, but closer.
You might find an $x$ in between (for example, a few million) where the relative error drops below 5%. The key idea: as $x$ grows, such $x$ definitely exist, and they become more common.